Active Learning to Classify Email
Authors
Abstract
While active learning has been applied successfully to improve text classification, its use in email classification has not yet been explored. This paper examines several state-of-the-art algorithms for active learning with support vector machines as applied to email folder classification. We also introduce several extensions to these methods specifically designed to improve the quality of active learning when used for email folders. We evaluated the relative accuracy of these algorithms using a large, publicly available email corpus. Our results show that current methods for active learning used in text classification work poorly for email foldering, but by taking chronological information, such as receipt time, into account, we can improve upon them significantly.
Similar resources
An Adaptive Congestion Alleviating Protocol for Healthcare Applications in Wireless Body Sensor Networks: Learning Automata Approach
Wireless Body Sensor Networks (WBSNs) involve a convergence of biosensors, wireless communication, and network technologies. WBSNs enable real-time healthcare services for users. Wireless sensors can be used to monitor patients’ physical conditions and transfer real-time vital signs to the emergency center or individual doctors. Wireless networks are subject to more packet loss and congestion. T...
Learning to Classify Email into "Speech Acts"
It is often useful to classify email according to the intent of the sender (e.g., "propose a meeting", "deliver information"). We present experimental results in learning to classify email in this fashion, where each class corresponds to a verb-noun pair taken from a predefined ontology describing typical “email speech acts”. We demonstrate that, although this categorization problem is quite dif...
Active Learning with Boosting for Spam Detection
Spam detection algorithms have been developed to train on a sufficiently large set of labeled data and predict, with a high accuracy of 95%, whether an email is spam or not. A problem that arises in this setting is that labeling examples is a costly process. It requires humans to read them one by one and classify them. Active learning is a learning approach developed to address this problem. It learns a sma...
Email Classification and Summarization: A Machine Learning Approach
This paper presents the design and implementation of a system to group and summarize email messages. The system uses the subject and content of email messages to classify emails based on users’ activities and to generate summaries of each incoming message with an unsupervised learning approach. Our framework solves the problem of email overload, congestion, difficulties in prioritizing and difficulti...
Publication date: 2005